Combining similarity in time and space for training set formation under concept drift

Author

  • Indre Zliobaite
Abstract

Concept drift is a challenge in supervised learning on sequential data. It refers to the phenomenon in which the data distribution changes over time. In such cases, classifier accuracy benefits from selective sampling of the training data. We develop a method for training set selection that is particularly relevant when the expected drift is gradual. At each time step, the training set is selected based on distance to the target instance, using a distance function that combines similarity in space and in time. The method determines an optimal training set size online at every time step using cross-validation. It is a wrapper approach, so different base classifiers can be plugged in. The proposed method achieves the best accuracy in its peer group on real and artificial drifting data. The complexity of the method is reasonable for field applications.
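
The abstract does not give the exact distance function or validation scheme, so the following is only a minimal sketch of the general idea under stated assumptions: the combined distance is taken to be a weighted sum of normalized Euclidean distance and normalized time difference, the training set size is picked by leave-one-out validation on the instances closest to the target, and a nearest-centroid rule stands in for the pluggable base classifier. All names (`combined_distance`, `select_training_set`, `alpha`, `n_val`) are hypothetical and not taken from the paper.

```python
import numpy as np

def combined_distance(X, t, x_target, t_target, alpha=0.5):
    """Assumed form: weighted sum of feature-space and time distance."""
    d_space = np.linalg.norm(X - x_target, axis=1)
    d_time = np.abs(t_target - t).astype(float)
    # Scale both parts to [0, 1] so that alpha is a meaningful trade-off.
    if d_space.max() > 0:
        d_space = d_space / d_space.max()
    if d_time.max() > 0:
        d_time = d_time / d_time.max()
    return alpha * d_space + (1.0 - alpha) * d_time

def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in base classifier: predict the class with the closest mean."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    return classes[np.argmin(np.linalg.norm(centroids - x, axis=1))]

def select_training_set(X, y, t, x_target, t_target, sizes,
                        alpha=0.5, n_val=5, predict=nearest_centroid_predict):
    """Return indices of the selected training instances for one target."""
    sizes = [k for k in sizes if 2 <= k <= len(X)]
    if not sizes:
        raise ValueError("need at least one candidate size in [2, len(X)]")
    # Rank all labelled instances by combined distance to the target.
    order = np.argsort(combined_distance(X, t, x_target, t_target, alpha))
    val = order[:n_val]  # closest instances serve as validation points
    best_size, best_errors = None, None
    for k in sizes:
        candidate = order[:k]
        errors = 0
        for i in val:
            train = candidate[candidate != i]  # leave the validation point out
            errors += int(predict(X[train], y[train], X[i]) != y[i])
        if best_errors is None or errors < best_errors:
            best_size, best_errors = k, errors
    return order[:best_size]
```

A base classifier would then be trained on `X[idx], y[idx]`, where `idx` is the returned index array, and the selection would be repeated for each new target instance. Setting `alpha=0` reduces the sketch to a plain sliding window over the most recent instances, while `alpha=1` selects purely by feature-space proximity.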


Similar resources

Learning Concept Drift Using Adaptive Training Set Formation Strategy

We live in a dynamic world, where changes are a part of everyday life. When there is a shift in data, classification or prediction models need to adapt to the changes. In data mining, the phenomenon of change in the data distribution over time is known as concept drift. In this research, the authors propose an adaptive supervised learning methodology with delayed labeling. As a part of thi...


Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

A data stream is a sequence of data generated by various information sources at high speed and in high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by using gradual forgetting. Concept drift refer...


Item Parameter Drift: Concepts, Methodology, and Identification

Item parameter drift occurs over time, for various reasons, when test items lose their initial characteristics, such as their difficulty and discrimination parameters. Causes of item parameter drift include items being revealed, excessive repetition, changes in the education system, the position of items, and poor initialization of the parameters. Item parameter drift causes the invariance to be violat...


Dimensional Similarity in the Study of Microbubble Production Inside Venturi Tube

The present study considers water and air flow and microbubble production inside a venturi tube by means of dimensional analysis. Numerical analysis of microbubble creation in a venturi tube requires fast computers and large amounts of storage space. Up to now, there has been no numerical analysis of microbubble creation, and all existing studies are experimental. To s...


The Problem of Presence in Space: Spatial Awareness and Agency with an Emphasis on Urban Public Space

Public space is the realm of the concrete and substantial presence of different social groups with different behavior patterns. The concept of space in this sense is an entity that is formed by people, through individual and collective action and social relations. The presence of people in the space, in a way that is free from domination, could strengthen urban life. This paper, b...



Journal:
  • Intell. Data Anal.

Volume 15

Publication date: 2011